Comparing machine learning and deep learning regression frameworks for accurate prediction of dielectrophoretic force

An intelligent sensing framework using Machine Learning (ML) and Deep Learning (DL) architectures to precisely quantify the dielectrophoretic (DEP) force induced on microparticles in a textile electrode-based DEP sensing device is reported. The prediction accuracy and generalization ability of the framework were validated using experimental results. Images of pearl chain alignment at varying input voltages were used to build deep regression models using modified ML and CNN architectures that can correlate pearl chain alignment patterns of Saccharomyces cerevisiae (yeast) cells and polystyrene microbeads to DEP force. Various ML models such as K-Nearest Neighbor, Support Vector Machine, Random Forest, Neural Networks, and Linear Regression, along with DL models such as the Convolutional Neural Network (CNN) architectures AlexNet, ResNet-50, MobileNetV2, and GoogLeNet, were analyzed in order to build an effective regression framework to estimate the force induced on yeast cells and microbeads. The efficiencies of the models were evaluated using Mean Absolute Error, Mean Absolute Relative Error, Mean Squared Error, R-squared, and Root Mean Square Error (RMSE) as evaluation metrics. ResNet-50 with RMSProp gave the best performance on yeast cells, with a validation RMSE of 0.0918, while AlexNet with the ADAM optimizer gave the best performance on microbeads, with a validation RMSE of 0.1745. This provides a baseline for further studies in the application of deep learning in DEP-aided Lab-on-Chip devices.

www.nature.com/scientificreports/

The time-averaged DEP force on a spherical microparticle is given by

$$F_{DEP} = 2\pi \varepsilon_m a^3 \, \mathrm{Re}[f_{cm}] \, \nabla E_{rms}^2$$

where $a$ is the radius of spherical microparticles in a medium of permittivity $\varepsilon_m$, under an alternating current (AC) field $E_{rms}$. $F_{DEP}$ depends on the product of the localized field with its gradient ($\nabla E_{rms}^2$) and on the frequency-dependent complex dielectric contrast of the particle versus the medium, as given by the real part of the Clausius-Mossotti factor, $\mathrm{Re}[f_{cm}]$:

$$f_{cm} = \frac{\varepsilon_p^* - \varepsilon_m^*}{\varepsilon_p^* + 2\varepsilon_m^*}$$

where $\varepsilon_p^*$ and $\varepsilon_m^*$ are the complex permittivities of the microparticles and the medium, respectively, each given as:

$$\varepsilon^* = \varepsilon - \frac{i\sigma}{\omega}$$

where $\sigma$ is the conductivity, $i = \sqrt{-1}$, and $\omega$ is the angular frequency of the applied AC field. The DEP fingerprints, or dielectric properties, of a particle in a given medium can be determined by altering the AC signal frequency. Particles can be manipulated once their DEP fingerprints are known. For particle chains, $F_{DEP}$ can be theoretically calculated 55 using multipole re-expansion and the method of images. $F_{DEP}$ on a particle chain is highly dependent on the angle between the applied field and the chain. The maximum attractive and repulsive forces in a chain grow significantly with the number of particles in the chain, but they saturate when the number of particles is large enough. This $F_{DEP}$ was analytically calculated in 17 in terms of the following quantities: $\varepsilon$, the relative permittivity of the medium; $E$, the electric field on the microparticle surface; $E_n$, the normal component of $E$; and $n$, the unit normal vector on the surface. However, a theoretical estimate of $F_{DEP}$ from micrographs is problematic due to discrepancies in the thread structure. At the microscopic level, the orientation of textile strands differs greatly, and the calculated force becomes ambiguous as a result. Instead, the applied voltage is predicted by examining the patterns of pearl chain orientation: from the micrographs collected, the deep learning regression algorithms predicted the voltage applied to the particles.
The force on a chain of spherical dielectric particles in a dielectric fluid is proportional to the number of particles as well as its orientation to the electric field, according to various studies 55,56 . As a result, a direct link between the applied voltage and the pearl chain formation has been established.
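The frequency dependence of the single-particle force can be explored numerically. The following is a minimal sketch, assuming the standard textbook forms $\varepsilon^* = \varepsilon - i\sigma/\omega$, $f_{cm} = (\varepsilon_p^* - \varepsilon_m^*)/(\varepsilon_p^* + 2\varepsilon_m^*)$ and $F_{DEP} = 2\pi\varepsilon_m a^3 \mathrm{Re}[f_{cm}] \nabla E_{rms}^2$. The material values, the field-gradient value and all function names are illustrative assumptions, not measurements from this study.

```python
import math

EPS0 = 8.854e-12  # vacuum permittivity, F/m


def complex_permittivity(eps_r, sigma, freq_hz):
    """Complex permittivity eps* = eps_r*eps0 - i*sigma/omega."""
    omega = 2.0 * math.pi * freq_hz
    return complex(eps_r * EPS0, -sigma / omega)


def clausius_mossotti(eps_p, eps_m):
    """Clausius-Mossotti factor f_cm = (eps_p* - eps_m*) / (eps_p* + 2 eps_m*)."""
    return (eps_p - eps_m) / (eps_p + 2.0 * eps_m)


def dep_force(radius_m, eps_m_real, re_fcm_value, grad_e2):
    """F_DEP = 2*pi*eps_m*a^3*Re[f_cm]*grad(E_rms^2), in newtons."""
    return 2.0 * math.pi * eps_m_real * radius_m ** 3 * re_fcm_value * grad_e2


# Assumed, illustrative values: a polystyrene-like bead in a
# low-conductivity aqueous buffer at the 200 kHz used in the experiments.
freq = 200e3
eps_particle = complex_permittivity(eps_r=2.5, sigma=1e-4, freq_hz=freq)
eps_medium = complex_permittivity(eps_r=78.0, sigma=1e-2, freq_hz=freq)

re_fcm = clausius_mossotti(eps_particle, eps_medium).real
force = dep_force(radius_m=5e-6, eps_m_real=78.0 * EPS0,
                  re_fcm_value=re_fcm, grad_e2=1e13)  # grad_e2 is assumed

print(f"Re[f_cm] = {re_fcm:.3f}, F_DEP = {force:.3e} N")
```

A negative `re_fcm` corresponds to negative DEP (repulsion from high-field regions), a positive value to positive DEP.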
Problem formulation. Let us assume that the $i$-th image is defined in an input space, $x_i \in X$, and that there is an output space $y_i \in Y = \{u_1, u_2, \cdots, u_K\}$ with sorted ranks $u_K \gg u_{K-1} \gg \cdots \gg u_1$. The symbol $\gg$ represents how different rankings are ordered. Given a training dataset $\chi = \{x_i, y_i\}_{i=1}^{N}$, the goal of regression is to create a mapping from pearl chain images to ranks, $g(\cdot): X \to Y$, such that the risk functional $R(g)$ is minimized using a specified cost $c: X \times Y \to \mathbb{R}$. In this research, the cost matrix $C$ is used to calculate the difference in cost between predicted and ground-truth ranks 53 . $C$ is a $K \times K$ matrix with $C_{y,u}$ denoting the cost of predicting a sample $(x, y)$ with rank $u$. Normally, $C_{y,u} > 0$ is assumed when $u \neq y$, and $C_{y,y} = 0$. For general regression problems, the absolute cost matrix, defined as $C_{y,u} = |y - u|$, is a frequent choice. When applying regression techniques to $F_{DEP}$ estimation, each voltage is treated as a rank.
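As a small illustration of the absolute cost matrix $C_{y,u} = |y - u|$, the sketch below builds it for ranks equal to the ten voltage levels (1 to 10 V) and averages the cost of a few predictions. The function names and the sample predictions are hypothetical and serve only to show the definition.

```python
def absolute_cost_matrix(ranks):
    """Return C with C[y][u] = |rank_y - rank_u| (zero on the diagonal)."""
    return [[abs(y - u) for u in ranks] for y in ranks]


def mean_cost(cost, ranks, y_true, y_pred):
    """Average cost of predicting y_pred when the ground truth is y_true."""
    index = {rank: i for i, rank in enumerate(ranks)}
    total = sum(cost[index[y]][index[u]] for y, u in zip(y_true, y_pred))
    return total / len(y_true)


voltages = list(range(1, 11))          # each voltage level is one rank
C = absolute_cost_matrix(voltages)

# Hypothetical predictions: exact, off by 2 V, off by 1 V
print(mean_cost(C, voltages, [1, 5, 10], [1, 7, 9]))
```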
Machine learning aided pearl chain detection from DEP micrographs. The DEP framework device (Fig. 1) comprises flexible textile electrodes sewn through a silicon O-ring (ID: 1 mm, OD: 3 mm). The textile electrodes are silver-coated conductive string composed of 82% nylon and 18% silver. This structure was mounted on a 1 × 1 inch glass slide. The strings were secured using copper tape, which acted as an electrical contact. Tests were performed by introducing 10 μL of fluid into the O-ring chamber. A custom 3D-printed microscope stage encloses the whole device for recording images. The pearl chain formations were recorded at different voltages at a fixed frequency of 200 kHz (Fig. 1b). During our dielectrophoresis experiments with yeast cells and 10-20 µm sized PS microbeads using this setup, 200 micrographs were collected at each voltage level from 1 to 10 V for each particle type (yeast cells and polystyrene microbeads), for a total of 4000 images.
Yeast cells (Saccharomyces cerevisiae) were grown in an incubator at 30 °C. The growth medium, yeast extract peptone dextrose, consisted of 20 g/l peptone, 10 g/l yeast extract, and 20 g/l dextrose dissolved in deionized (DI) water. The cells were collected at the stationary growth phase after 1 day of culture in a shaking incubator, harvested by centrifugation for 2 min at 3000 rpm, and re-suspended in measurement buffers. Plain polystyrene (PS) beads (10, 20 µm) were purchased from Spherotech, Inc., USA. PS beads are charge neutral and hydrophobic. No surface functionalization was used, and the buffer did not include any surfactant. Low conductivity buffer: All the microparticles were suspended in an isotonic buffer consisting of 200 mM sucrose, 16 mM glucose, 1 MCaCl 2 , and 5 mM Na 2 HPO 4 in DI water (pH 7.4) for the experiments.
Feature extraction for machine learning based regression analysis. We designed a template matching algorithm for object detection using OpenCV (Algorithm I) to extract the total number of pearl chains in an image, count each pearl chain, and map the counts into a matrix that represents these features (Fig. 2). Pearl chains are identified within the image using a recognition template, which is a cropped image of an individual microparticle. The dimensions of the template image (height and width) are also extracted to calculate the radius of the microparticle. The radius of a sample pearl, in pixels, is calculated as $r = (l + b)/4$, where $l$ is the length and $b$ is the breadth of the template image of the microparticle. The constant $c$, which appears in the adjacency condition $2r + c$ used to link neighboring microparticles, is fixed at 1/4 of $r$. The value of $c$ can be adjusted until the output data set includes previously undetected pearl chains.
The input image is represented as $I(x, y)$, with $(x, y)$ denoting the pixel coordinates, and $T(x', y')$ denotes the template, with $(x', y')$ the coordinates of each pixel in the template. Template-based matching is done by moving the center (or the origin) of the template $T(x', y')$ over each $(x, y)$ point in the input image and calculating the sum of products between the coefficients in $I(x, y)$ and $T(x', y')$ over the whole area spanned by the template. As all possible positions of the template with respect to the input image are searched, the position with the highest score is the best match. In the OpenCV implementation, for each location of $T$ over $I$, we store the cross-correlation metric (TM_CCORR_NORMED) in the result matrix $R$. This metric is expressed mathematically as $R(x, y)$ in Eq. 6. Each location $(x, y)$ in $R$ contains the match score, that is, the normalized cross-correlation obtained when the template is placed at that location. The brightest locations indicate the best matches.
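To make the score concrete, the sketch below evaluates the normalized cross-correlation in plain Python using the definition given in the OpenCV documentation for TM_CCORR_NORMED, $R(x, y) = \sum_{x',y'} T(x', y') \, I(x + x', y + y') \, / \, \sqrt{\sum_{x',y'} T(x', y')^2 \cdot \sum_{x',y'} I(x + x', y + y')^2}$. It is an illustration only, not the authors' script: in practice `cv2.matchTemplate` followed by `cv2.minMaxLoc` does this far more efficiently. The tiny image, the template and the function names are assumptions, and $(x, y)$ here indexes the top-left corner of the template, as in OpenCV.

```python
import math


def ccorr_normed(image, template):
    """Normalized cross-correlation score map for a grayscale image.

    result[y][x] is the score with the template's top-left corner at (x, y).
    """
    ih, iw = len(image), len(image[0])
    th, tw = len(template), len(template[0])
    t_energy = sum(v * v for row in template for v in row)
    result = []
    for y in range(ih - th + 1):
        scores = []
        for x in range(iw - tw + 1):
            num = 0.0
            i_energy = 0.0
            for dy in range(th):
                for dx in range(tw):
                    pixel = image[y + dy][x + dx]
                    num += template[dy][dx] * pixel
                    i_energy += pixel * pixel
            denom = math.sqrt(t_energy * i_energy)
            scores.append(num / denom if denom > 0 else 0.0)
        result.append(scores)
    return result


def best_match(result):
    """Return ((x, y), score) of the global maximum of the score map."""
    score, location = max(
        (score, (x, y))
        for y, row in enumerate(result)
        for x, score in enumerate(row)
    )
    return location, score


image = [
    [0, 0, 0, 0, 0],
    [0, 1, 2, 0, 0],
    [0, 3, 4, 0, 0],
    [0, 0, 0, 0, 0],
]
template = [[1, 2], [3, 4]]

location, score = best_match(ccorr_normed(image, template))
print(location, score)
```

To detect many particles rather than one, all locations whose score exceeds a chosen threshold would be kept instead of only the global maximum.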
The method matchTemplate() in the OpenCV library was used to compare the template image with the input images. The external libraries cv2, numpy, glob and workbook were also used. The detected microparticles are marked and the corresponding coordinates are stored. Individual pearls are marked in the image using the imwrite() method. The coordinates obtained are combined with the value of the radius of the pearl, which is then used to identify the pearl chain. After the function finishes the comparison, the best matches can be found as global maxima (TM_CCORR_NORMED) using the minMaxLoc function. In the case of a color image, the template summation in the numerator and each sum in the denominator are done over all the channels. The result is still a single-channel image, which is easier to analyze. The center coordinate of each microparticle is extracted from the coordinate data set. Each coordinate is used to search for adjacent microparticles using the distance condition $2r + c$. All the microparticles nearest to a chain are identified and grouped into a dataset, duplicates are removed, and pearl chains are categorized into pearl chain counts, $C_L$ where $L = [2, 3, 4 . . . 18]$. Identified pearl chains are stored in a list, and at the end of processing the pearl chain lengths and counts are stored in an Excel sheet. The precision of image detection can be controlled by changing the threshold values in the code. The method excelWrite() is used to write the data from the bulk-processed images to the Excel sheet.

Figure 2. Pearl chain analysis using a Template-Based Matching algorithm and particle coordinate search algorithm. A template image is shifted across the DEP micrographs by an offset (x, y) using the origins of the two images as reference points. Pearl chain lengths (Li) are determined using a particle coordinate search algorithm.
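The coordinate search described above can be sketched as a simple grouping problem. The code below is a minimal illustration, assuming that two microparticles belong to the same chain when the distance between their centers is at most $2r + c$; it uses a union-find structure, which is one possible implementation and not necessarily the one in the authors' script. The function names and the sample coordinates are hypothetical.

```python
import math
from collections import Counter


def count_pearl_chains(centers, r, c):
    """Group particle centers into chains and count chains per length.

    Two particles are linked when their center distance is <= 2r + c.
    Returns a Counter mapping chain length L (>= 2) to the count C_L.
    """
    n = len(centers)
    parent = list(range(n))

    def find(i):
        # Follow parent links to the root, compressing the path on the way
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    limit = 2 * r + c
    for i in range(n):
        for j in range(i + 1, n):
            dx = centers[i][0] - centers[j][0]
            dy = centers[i][1] - centers[j][1]
            if math.hypot(dx, dy) <= limit:
                parent[find(i)] = find(j)  # merge the two groups

    group_sizes = Counter(find(i) for i in range(n))
    return Counter(size for size in group_sizes.values() if size >= 2)


r = 10.0        # particle radius in pixels, r = (l + b) / 4
c = r / 4       # tolerance constant, 1/4 of r
centers = [(0, 0), (20, 0), (41, 0),      # a chain of three
           (100, 100), (100, 121),        # a chain of two
           (300, 300)]                    # an isolated particle

counts = count_pearl_chains(centers, r, c)
features = [counts.get(L, 0) for L in range(2, 19)]  # C_2 ... C_18
print(dict(counts), features)
```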
Machine learning model training. Prediction of the target variable (applied voltage) was done using 18 predictors, which are the pearl chain features. Predictors of our model were extracted from the micrographs (Fig. 2). Each predictor corresponds to a pearl chain length, i.e. the number of microparticles in a chain, and its value is the number of chains of that length observed at an applied voltage. The pearl chain count $C_L$, where $L = [2, 3, 4 . . . 18]$, represents the number of pearl chains of a specific chain length $L$. The $C_L$ values for all images taken at different voltages are stored as a matrix and used as features or predictors in order to represent the DEP force. We hypothesize that the pearl chain counts $C_L$ in a micrograph are a thorough representation of DEP micrographs. As is evident from the micrographs, pearl chain formations were observed at voltages as low as 2 V. However, almost all of these pearl chains had no more than 2 microparticles ($C_2$). At 3 V, 84% of the pearl chains were $C_2$ and 15% of them were $C_3$. For voltages beyond 5 V, ~40% of the pearl chains have more than 4 microparticles ($C_4$). Above 7 V, $C_8 - C_{10}$ is significant (6.7%). In the 7-10 V range, $C_2 - C_5$ percentages were very low and the majority of the pearl chains had more than 8 microparticles ($C_8$). ML analyses were performed using the Orange toolbox by writing Python scripts accessing the Orange API. Additional functionalities such as feature importance were developed using Python Script widgets. 80% of the dataset was assigned to training and 20% to testing. Missing values were replaced with the median value of the features. Features with higher dominance in predicting the targets were identified from the training sets. As shown in Fig. S1, different ML architectures such as K-Nearest Neighbor (KNN), Support Vector Machine (SVM), Random Forest, Neural Networks, and Linear Regression were trained on the dataset extracted from the PS microbeads micrographs. The Python scripts used for the machine learning are available on GitHub (https://github.com/skmidhun09/image_detection_python).
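The data preparation steps mentioned above (median imputation and an 80/20 split) can be sketched in a few lines of standard-library Python. This is an illustration under assumptions: the study used the Orange toolbox, whose widgets are not reproduced here; the text does not state whether medians were computed on the full dataset or only on the training portion (the sketch uses all rows); and the function names, seed and toy rows are hypothetical.

```python
import random
import statistics


def impute_median(rows):
    """Replace None entries with the median of the observed column values."""
    columns = list(zip(*rows))
    medians = [statistics.median(v for v in col if v is not None)
               for col in columns]
    return [[m if v is None else v for v, m in zip(row, medians)]
            for row in rows]


def split_train_test(rows, targets, test_fraction=0.2, seed=0):
    """Shuffle indices reproducibly and split into train and test parts."""
    indices = list(range(len(rows)))
    random.Random(seed).shuffle(indices)
    n_test = round(len(rows) * test_fraction)
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    return ([rows[i] for i in train_idx], [rows[i] for i in test_idx],
            [targets[i] for i in train_idx], [targets[i] for i in test_idx])


# Toy feature rows (two pearl chain counts per image) with missing entries
rows = [[1, None], [3, 4], [None, 6], [5, 8], [7, 10]]
targets = [1, 2, 3, 4, 5]  # applied voltage in V

filled = impute_median(rows)
train_X, test_X, train_y, test_y = split_train_test(filled, targets)
print(filled, len(train_X), len(test_X))
```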
Feature importance estimation for maximum relevance and minimum redundancy. Extraneous features degrade the performance of a model while also increasing computing costs, so it is critical to find a subgroup of highly relevant features. Some features have a considerably greater impact on the response than others. We ranked the importance of the features or predictors using the RReliefF algorithm with k-nearest neighbors. RReliefF is a feature-weighting algorithm that works with a continuous target. It penalizes features that take different values for neighbors with the same response values, and rewards features that take different values for neighbors with different response values. Figure 3 shows the feature importance ranking obtained by implementing RReliefF. Among the $C_L$ predictors, where $L = [2, 3, 4 . . . 18]$, $C_8$ was found to be the most important feature with a weight of 0.48, followed by $C_{10}$ and $C_{11}$ with importance weights of 0.37 and 0.36, respectively. $C_1$ had the lowest score of 0.06, and $C_{12} - C_{14}$ were found to be insignificant.
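The penalize/reward behavior described above can be seen in a compact sketch of the RReliefF weight update. This is a simplified variant written for illustration, not the implementation used in the study: every instance is visited once (instead of random sampling), the k nearest neighbors are weighted uniformly by 1/k (the original algorithm uses rank-based weights), and differences are scaled by each variable's range. The function name and the toy data are assumptions.

```python
def rrelieff_weights(X, y, k=2):
    """Simplified RReliefF feature weights for a continuous target.

    W[a] = N_dCdA[a] / N_dC - (N_dA[a] - N_dCdA[a]) / (m - N_dC)
    Assumes the target is not constant (N_dC > 0).
    """
    n, n_features = len(X), len(X[0])
    feature_range = [max(row[a] for row in X) - min(row[a] for row in X)
                     for a in range(n_features)]
    target_range = max(y) - min(y)

    def diff(u, v, value_range):
        # Difference scaled to [0, 1]; constant variables contribute 0
        return abs(u - v) / value_range if value_range > 0 else 0.0

    n_dc = 0.0                       # accumulates "different target"
    n_da = [0.0] * n_features        # accumulates "different feature value"
    n_dcda = [0.0] * n_features      # accumulates both at once

    for i in range(n):
        def distance(j):
            return sum(diff(X[i][a], X[j][a], feature_range[a])
                       for a in range(n_features))

        neighbours = sorted((j for j in range(n) if j != i), key=distance)[:k]
        for j in neighbours:
            d_target = diff(y[i], y[j], target_range)
            n_dc += d_target / k
            for a in range(n_features):
                d_feature = diff(X[i][a], X[j][a], feature_range[a])
                n_da[a] += d_feature / k
                n_dcda[a] += d_target * d_feature / k

    m = n  # number of visited instances
    return [n_dcda[a] / n_dc - (n_da[a] - n_dcda[a]) / (m - n_dc)
            for a in range(n_features)]


# Toy data: feature 0 tracks the target, feature 1 is constant
X = [[v, 5] for v in range(11)]
y = list(range(11))
weights = rrelieff_weights(X, y, k=2)
print(weights)
```

In this toy case the informative feature receives a positive weight and the constant feature a weight of zero, so ranking by weight places the informative feature first.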

Convolutional neural networks as base architecture for deep regression. Convolutional Neural Networks (CNNs) were used to extract local trends from spatio-temporal patterns of pearl chain formation. CNNs have at least one layer that uses the convolution operation to extract features 57,58 . CNNs are used in image processing applications, including automated histopathological image segmentation 59 , automated reconstruction of low-contrast images such as magnetic resonance imaging (MRI) 60,61 , quantification of cyanobacteria from hyperspectral images 62 , and medical image processing for direct disease diagnosis 63,64 , as well as in other disciplines including speech recognition 58,65 and weather forecasting 57,66 .
We used four pre-trained CNN architectures, viz. AlexNet 67 , MobileNetV2 68,69 , GoogLeNet 70 and ResNet-50 71 , as the base architectures for deep regression analysis. Table 2 presents a brief overview of these pre-trained CNN architectures. All of them were initialized from versions of the networks pre-trained on the ImageNet dataset for classification. As illustrated in Fig. 4, the pre-trained CNN architectures consist of an input layer, which represents the pixel matrix of an input image, followed by a series of convolution layers that use Rectified Linear Unit (ReLU) activation. Between two convolution layers is a pooling layer, where a max pooling operation down-samples the convolved image (feature map). These are followed by a fully connected layer, in which all the inputs are connected, and by softmax layers. To retrain these pre-trained networks for regression, we removed the layers used for classification from the base architectures (AlexNet, MobileNetV2, GoogLeNet and ResNet-50): the final fully connected layer, the softmax layer, and the classification output layer were replaced with a fully connected layer of size 1 (the number of output variables) with linear activation, followed by a regression layer. As a result, the last layer is a regression layer whose output dimension corresponds to that of the target space. Notable hyperparameters such as the learning rate $\alpha$ and batch size $n_b$ were tuned to minimize the cost function and speed up optimization while promoting convergence and mitigating overfitting 69,72 . Table 3 presents the CNN hyperparameters used.
A series of adaptive learning rate algorithms have recently been developed. The Adaptive Moments (Adam) 72 , root mean square propagation (RMSProp) 72,73 , and stochastic gradient descent with momentum (SGDM) 74 optimizers explored in this work are among the most widely used optimization algorithms. Table 4 presents a concise overview of these algorithms.

Table 4. Overview of optimization algorithms.

Optimizer: RMSProp
Description:
i. An extension of the Adaptive Gradient (AdaGrad) variant of gradient descent
ii. It avoids drastically lowering learning rates by converting the gradient accumulation to an exponentially weighted moving average
iii. Through this weighting, RMSProp only considers recent gradients

Optimizer: ADAM
Description:
i. ADAM is an improvement to the RMSProp optimizer that incorporates the momentum method
ii. It is an algorithm for handling sparse gradients in noisy problems
iii. ADAM is simple to set up, and the default settings work well for most problems

The image files collected from micrographs of pearl chain formation at various voltages ranging from 1 to 10 V were used to perform deep learning analysis.
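For reference, the commonly used update rules of the three optimizers can be written out for a single scalar weight. The sketch below follows the standard textbook forms; the hyperparameter defaults (`lr`, `momentum`, `rho`, `beta1`, `beta2`, `eps`) are assumed illustrative values and are not the settings listed in Table 3.

```python
import math


def sgdm_step(w, grad, velocity, lr=0.001, momentum=0.9):
    """SGD with momentum: the velocity accumulates past gradient steps."""
    velocity = momentum * velocity - lr * grad
    return w + velocity, velocity


def rmsprop_step(w, grad, sq_avg, lr=0.001, rho=0.9, eps=1e-8):
    """RMSProp: scale the step by a moving average of squared gradients."""
    sq_avg = rho * sq_avg + (1.0 - rho) * grad * grad
    return w - lr * grad / (math.sqrt(sq_avg) + eps), sq_avg


def adam_step(w, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """Adam: momentum (m) plus RMSProp-style scaling (v), bias-corrected."""
    m = beta1 * m + (1.0 - beta1) * grad
    v = beta2 * v + (1.0 - beta2) * grad * grad
    m_hat = m / (1.0 - beta1 ** t)   # bias correction, t starts at 1
    v_hat = v / (1.0 - beta2 ** t)
    return w - lr * m_hat / (math.sqrt(v_hat) + eps), m, v


# One step on f(w) = w^2 starting at w = 1, where the gradient is 2w = 2
w0, g0 = 1.0, 2.0
w_sgdm, _ = sgdm_step(w0, g0, velocity=0.0)
w_rms, _ = rmsprop_step(w0, g0, sq_avg=0.0)
w_adam, _, _ = adam_step(w0, g0, m=0.0, v=0.0, t=1)
print(w_sgdm, w_rms, w_adam)
```

On the first step Adam moves by almost exactly `lr` regardless of the gradient scale, whereas the SGDM step is proportional to the gradient itself.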
Image preprocessing and segmentation. In order to improve the computational time and accuracy, we applied an optimum adaptive threshold method to reduce the complexity of pearl chain images 75 . Figure 5 depicts the flowchart of the image analysis and segmentation technique. Also, we summarized all of the steps of the segmentation process in Algorithm II.
The MATLAB Image Processing Toolbox was used to prototype the image processing methods in this research. Grayscale conversion, adaptive thresholding segmentation, and morphological operations were all part of the image processing procedures, which were carried out systematically for 4000 images 5,76-78 . $I(u, v)$ are grayscale images of the pearl chains in a Euclidean space $E$. An adaptive global threshold was employed to segment the pearl chain regions from the micrograph. The threshold value was obtained from the image histogram to produce the binarized output $I_\beta(u, v)$.
$\alpha$ is the adaptive threshold value applied to the original input image $I(u, v)$ to obtain the resultant image, denoted $I_\beta(u, v)$. In the next step, a morphological operation called dilation is applied to $I_\beta(u, v)$ using a structuring element $S_1$ (an array of horizontal and vertical lines). The dilation of $I_\beta(u, v)$ by $S_1$, which gives $I_\gamma(u, v)$, is mathematically defined as in 76,77 by Eqs. (8) and (9), where $(S_1)_z$ is the translation of the array $S_1$ by the vector $z$ and $\emptyset$ is the null set. Next, $I_\gamma(u, v)$ is eroded by a horizontal structuring element $S_2$ and the resulting image is dilated by $S_2$, an operation defined in 77,78 , to obtain $I_\delta(u, v)$. We then fill the holes of the pearl chains and clear their borders. Then, applying a morphological closing operation, defined in 77 as the dilation of $I_\delta(u, v)$ by a horizontal structuring element $S_2$ followed by erosion of the resulting image by $S_2$, we obtain $I_\theta(u, v)$. In the final step, the output segmented image $O(u, v)$ is generated by concatenating $I_\theta(u, v)$ thrice to form the equivalent true color (RGB) image.
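The thresholding and morphological steps can be illustrated on a tiny binary image. The sketch below is plain Python rather than the MATLAB toolbox functions used in the study, and it makes several assumptions: pixels brighter than $\alpha$ are treated as foreground, pixels outside the image count as background, and the structuring element is a symmetric set of (row, column) offsets. All function names are hypothetical.

```python
def threshold(gray, alpha):
    """Binarize: 1 where the pixel exceeds alpha, else 0 (assumed polarity)."""
    return [[1 if p > alpha else 0 for p in row] for row in gray]


def dilate(img, offsets):
    """Set a pixel if ANY pixel under the structuring element is set."""
    h, w = len(img), len(img[0])
    return [[int(any(0 <= v + dv < h and 0 <= u + du < w and img[v + dv][u + du]
                     for dv, du in offsets))
             for u in range(w)] for v in range(h)]


def erode(img, offsets):
    """Keep a pixel only if ALL pixels under the structuring element are set."""
    h, w = len(img), len(img[0])
    return [[int(all(0 <= v + dv < h and 0 <= u + du < w and img[v + dv][u + du]
                     for dv, du in offsets))
             for u in range(w)] for v in range(h)]


def closing(img, offsets):
    """Morphological closing: dilation followed by erosion."""
    return erode(dilate(img, offsets), offsets)


def to_rgb(img):
    """Replicate the single channel three times to form an RGB image."""
    return [[[p, p, p] for p in row] for row in img]


HORIZONTAL = [(0, -1), (0, 0), (0, 1)]   # 1 x 3 horizontal line element

binary = threshold([[10, 200], [130, 90]], alpha=128)
gap_row = [[0, 1, 1, 0, 1, 1, 0]]        # two segments split by a 1-pixel gap
closed = closing(gap_row, HORIZONTAL)
print(binary, closed)
```

Closing with the horizontal element bridges the one-pixel gap, which is the effect that joins neighboring particles of a pearl chain into one connected region.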

Results and discussion
Model testing and evaluation metrics. The models were assessed by testing whether the pearl chain arrangements in a micrograph can be correlated to the input voltages, and the best-performing model was identified using five key performance metrics: Mean Absolute Error (MAE), Mean Relative Error (MRE), Mean Squared Error (MSE), R-squared, and Root Mean Square Error (RMSE) 48,49,57,79 . They are mathematically expressed by Eqs. (15)-(20), where $y$, $\hat{y}$, and $\bar{y}$ denote the actual value, the predicted value, and the mean of the $y$ values, and $n$ is the number of samples. The MAE assesses the average magnitude of errors in a group of predictions without taking into account their direction. It assesses the precision of continuous variables. MSE is often referred to as quadratic loss, since the penalty is related to the square of the error rather than the error itself. Because the error is squared, outliers are given more weight, while the gradient remains smooth for small errors. MSE grows quadratically with increasing error. The MSE value of a good model should be close to zero. RMSE is computed by taking the square root of MSE. RMSE is more easily interpreted because it has the same units as the predicted quantity. MAE, MSE, and RMSE can range from 0 to ∞. The goodness of fit of a regression model is represented by a statistical measure called R-squared. The optimal R-squared value is 1; the closer the R-squared value is to 1, the better the model fits.
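The five metrics can be computed directly from the actual and predicted values. The sketch below uses the usual definitions, with MRE taken as the mean of $|y - \hat{y}| / |y|$, which is an assumption about the exact form used in the study. The function name and the sample values are illustrative.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MAE, MRE, MSE, RMSE and R-squared for paired values."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    mre = sum(abs(e) / abs(t) for e, t in zip(errors, y_true)) / n
    mse = sum(e * e for e in errors) / n
    rmse = math.sqrt(mse)
    mean_y = sum(y_true) / n
    ss_res = sum(e * e for e in errors)               # residual sum of squares
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)   # total sum of squares
    r2 = 1.0 - ss_res / ss_tot
    return {"MAE": mae, "MRE": mre, "MSE": mse, "RMSE": rmse, "R2": r2}


# Hypothetical voltages (V): three exact predictions and one that is 2 V off
metrics = regression_metrics([1, 2, 3, 4], [1, 2, 3, 6])
print(metrics)
```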
Deep regression model for dielectrophoretic force estimation. After the image processing and segmentation steps, the images are resized to fit the input layer dimension of each model (Table 2). These segmented image datasets are then augmented, by randomly flipping them along the vertical axis and randomly translating them horizontally and vertically by up to 30 pixels, for training and validating the deep regression models (Fig. 6). Data augmentation helps keep the networks from overfitting and helps ensure that they generalize adequately. AlexNet, ResNet-50, MobileNetV2, and GoogLeNet were the four CNN architectures examined in this study. During the training phase of the models, the image datasets (2000 image samples each for yeast cells and PS microbeads) were partitioned into 80% for training (i.e. 1600 image samples each for yeast cells and PS microbeads) and 20% for validation (i.e. 400 image samples each for yeast cells and PS microbeads). The training was done in MATLAB R2021a, and the deep learning experiments were done with its Deep Learning Toolbox. A DELL laptop with a five-core Intel 8th Generation processor served as our development system. The MATLAB code used for the deep learning described above is available on GitHub (https://github.com/AjalaSunday/Neural-Networks-Fall2021/blob/16ac1708307431272fc2c5c72ee25594d0a81446/RegressionCode.m).
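The augmentation described above can be sketched for a single-channel image stored as a list of rows. This is an illustration in plain Python, not the MATLAB augmenter used in the study. It assumes that flipping along the vertical axis means a left-right mirror, that the flip is applied with probability 0.5, and that pixels shifted in from outside the image are filled with zeros. The function name and the toy image are hypothetical.

```python
import random


def augment(image, rng, max_shift=30):
    """Randomly mirror left-right, then translate by up to max_shift pixels."""
    h, w = len(image), len(image[0])
    if rng.random() < 0.5:
        image = [row[::-1] for row in image]       # mirror about vertical axis
    dx = rng.randint(-max_shift, max_shift)        # horizontal shift
    dy = rng.randint(-max_shift, max_shift)        # vertical shift
    out = [[0] * w for _ in range(h)]
    for v in range(h):
        for u in range(w):
            src_v, src_u = v - dy, u - dx
            if 0 <= src_v < h and 0 <= src_u < w:
                out[v][u] = image[src_v][src_u]
    return out


rng = random.Random(42)                            # seeded for reproducibility
sample = [[(u + v) % 2 for u in range(8)] for v in range(8)]
augmented = augment(sample, rng, max_shift=2)
print(len(augmented), len(augmented[0]))
```

The output keeps the input dimensions, so augmented images can be fed to the same network input layer as the originals.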
The four architectures were evaluated by computing the R-squared, MAE, MRE, MSE, and RMSE values on the testing dataset. Tables 5 and 6 show the results achieved by the architectures and various optimizers in our experiments with yeast cells and microbeads, respectively. As can be seen in Table 5, all the models and optimization algorithms performed well above 95% based on the accuracy metric (also see Table S2). When trained on the yeast cell dataset, ResNet-50 with the RMSProp optimizer had the best validation RMSE of 0.0918 on the test dataset, followed by ResNet-50 with the ADAM optimizer, with a validation RMSE of 0.1241. This is also illustrated in the chart in Fig. 7. Figures S2 and S3 show the evolution of the validation RMSE for ResNet-50 with the RMSProp optimizer and ResNet-50 with the ADAM optimizer on the yeast cell dataset, respectively, while the regression lines are illustrated in Fig. 8a and b, respectively.
On the microbeads dataset, as can be seen in Table 6, all the models and optimization algorithms performed well above 90% based on the accuracy metric (also see Tables S3 and S4). AlexNet with the ADAM optimizer had the best validation RMSE of 0.1745 across all models on the PS microbeads dataset, followed by ResNet-50, also with the ADAM optimizer, with a validation RMSE of 0.1869. This is also illustrated in the chart in Fig. 9. Figures S4 and S5 show the evolution of the validation RMSE for AlexNet with the ADAM optimizer and ResNet-50 with the ADAM optimizer on the PS microbeads dataset, respectively, while the regression lines are illustrated in Fig. 10a and b, respectively. Comparing the adaptive learning rate optimization algorithms explored in this work, we found that ADAM has the lowest sum of RMSE, followed by RMSProp, while SGDM has the highest sum of RMSE on both datasets, as shown in Fig. 11a and b.

Figure 9. Model performance on microbeads dataset for various architectures. AlexNet with ADAM optimizer has the best validation RMSE of 0.1745 on test dataset, followed by ResNet-50 with ADAM optimizer having a validation RMSE of 0.1869.

Conclusion
This paper presents an intelligent sensing framework capable of direct estimation of DEP force from the pearl chain alignment of microparticles. We tested the proposed models in a textile electrode-based dielectrophoretic system. The proposed deep regression models were extensively examined, and the results were compared with conventional machine learning approaches. The intrinsic features of microparticle alignment, such as pearl chain length and count, were extracted using image segmentation algorithms and used to generate training datasets. The experimental results show that the DL models outperformed the ML models in terms of prediction accuracy and generalization ability. ResNet-50 with RMSProp gave the best performance on yeast cells, with a validation RMSE of 0.0918, while AlexNet with the ADAM optimizer gave the best performance on microbeads, with a validation RMSE of 0.1745. The regression model we developed can be extended to biosensing systems in order to estimate variations in the dielectric properties of microparticles.

Data availability
The dataset used for the current study is available on Dryad via this link. It is currently available to the reviewers and will be made available to the public after this article has been peer reviewed, or upon request from the corresponding author.